A Generalized Framework for Auto-tuning Stencil Computations

Authors

  • Shoaib Kamil
  • Cy Chan
  • Samuel Williams
  • Leonid Oliker
  • John Shalf
  • Mark Howison
  • E. Wes Bethel
Abstract

This work introduces a generalized framework for automatically tuning stencil computations to achieve superior performance on a broad range of multicore architectures. Stencil (nearest-neighbor) based kernels constitute the core of many important scientific applications involving block-structured grids. Auto-tuning systems search over optimization strategies to find the combination of tunable parameters that maximizes computational efficiency for a given algorithmic kernel. Although the auto-tuning strategy has been successfully applied to libraries, generalized stencil kernels are not amenable to packaging as libraries. The kernels studied in this work include both memory-bound kernels and a computation-bound bilateral filtering kernel. We introduce a generalized stencil auto-tuning framework that takes a straightforward Fortran expression of a stencil kernel and automatically generates tuned implementations of the kernel in C or Fortran to achieve performance portability across diverse computer architectures.
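The search idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework, which takes Fortran input and generates C or Fortran; it is a pure-Python toy that assumes a 2D 5-point averaging stencil and treats the cache-block dimensions as the only tunable parameters. All function names (`stencil_reference`, `stencil_blocked`, `autotune`, `make_grid`) are hypothetical and chosen for illustration only.

```python
import time


def make_grid(n, m):
    """Build a deterministic n x m test grid (illustrative helper)."""
    return [[float((i * m + j) % 7) for j in range(m)] for i in range(n)]


def stencil_reference(grid):
    """One sweep of a 5-point averaging stencil over the grid interior."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.2 * (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                               + grid[i][j - 1] + grid[i][j + 1])
    return out


def stencil_blocked(grid, bi, bj):
    """Same sweep, traversed in bi x bj blocks (the tunable parameters)."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, bi):
        for jj in range(1, m - 1, bj):
            for i in range(ii, min(ii + bi, n - 1)):
                for j in range(jj, min(jj + bj, m - 1)):
                    out[i][j] = 0.2 * (grid[i][j] + grid[i - 1][j]
                                       + grid[i + 1][j] + grid[i][j - 1]
                                       + grid[i][j + 1])
    return out


def autotune(grid, candidates, repeats=3):
    """Time every candidate block shape and return the fastest one.

    Exhaustive search is an assumption of this sketch; a real auto-tuner
    may prune or model the search space instead.
    """
    best, best_time = None, float("inf")
    for bi, bj in candidates:
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            stencil_blocked(grid, bi, bj)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best, best_time = (bi, bj), elapsed
    return best, best_time
```

Calling `autotune(make_grid(64, 64), [(4, 4), (8, 8), (16, 64)])` returns the block shape that ran fastest on the current machine together with its best measured time. In interpreted Python the timing differences mostly reflect loop overhead rather than cache behavior, so the sketch shows only the structure of the search, not realistic speedups.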

Similar resources

PATUS: A Code Generation and Auto-Tuning Framework For Parallel Stencil Computations

PATUS is a code generation and auto-tuning framework for stencil computations targeted at modern multi- and many-core processors, such as multicore CPUs and graphics processing units. Its ultimate goals are to provide a means towards productivity and performance on current and future multi- and many-core platforms. The framework generates the code for a compute kernel from a specification of the st...

Auto-tuning the 27-point Stencil for Multicore

This study focuses on the key numerical technique of stencil computations, used in many different scientific disciplines, and illustrates how auto-tuning can be used to produce very efficient implementations across a diverse set of current multicore architectures.

An Auto-tuning JIT Compiler for Accelerating Multiple Stencil Computations

We present a JIT compiler with auto-tuning capabilities fusing multiple stencil computations. Data arrays for scientific computing or image processing often exceed cache-memory size. To take advantage of spatial and temporal locality, a common method is to partition the images into tiling blocks for multicore architectures. In realistic scenarios, the multiple image algorithms, most of which ar...

Model-Driven Auto-Tuning of Stencil Computations on GPUs

Stencil computations are a class of algorithms that perform nearest-neighbor computation, often on a multi-dimensional grid. This type of calculation forms the basis for computer simulations across almost every field of science. The increasing computational speed of graphics processing units (GPUs) makes their use for stencil computations an interesting goal. However, achieving highly efficient...

Auto-tuning for Energy Usage in Scientific Applications

The power wall has become a dominant impeding factor in the realm of exascale system design. It is therefore important to understand how to most effectively create application software in order to minimize its power usage while maintaining satisfactory levels of performance. In this work, we use existing software and hardware facilities in order to tune applications to minimize for several comb...

Publication date: 2009